Abstract


Building  heating,  ventilation,  and  air  conditioning  (HVAC)  equipment  routinely  fails  to  satisfy 
performance  expectations  envisioned  at  design.  Such  failures  often  go  unnoticed  for  extended 
periods  of  time.  Additionally,  higher  expectations  are  being  placed  on  a combination  of  different 
and  often  conflicting  performance  measures,  such  as  energy  efficiency,  indoor  air  quality, 
comfort,  reliability,  limiting  peak  demand  on  utilities,  etc.  To  meet  these  expectations,  the 
processes,  systems,  and  equipment  used  in  both  commercial  and  residential  buildings  are 
becoming  increasingly  sophisticated.  This  development  both  necessitates  the  use  of  automated 
diagnostics  to  ensure  fault-free  operation  and  enables  diagnostic  capabilities  for  the  various 
building  systems  by  providing  a distributed  platform  that  is  powerful  and  flexible  enough  to 
perform  fault  detection  and  diagnostics  (FDD). 

The  purpose  of  the  research  effort  described  in  this  report  is  to  develop,  test,  and  demonstrate 
FDD  methods  that  can  detect  common  mechanical  faults  and  control  errors  in  air-handling  units 
(AHUs)  and  variable-air-volume  (VAV)  boxes.  The  tools  are  intended  to  be  sufficiently  simple 
that  they  can  be  embedded  in  commercial  building  automation  and  control  systems  and  rely  upon 
only  sensor  data  and  control  signals  that  are  commonly  available  in  these  systems. 

AHU  Performance  Assessment  Rules  (APAR)  is  a diagnostic  tool  that  uses  a set  of  expert  rules 
derived  from  mass  and  energy  balances  to  detect  faults  in  air-handling  units.  Control  signals  are 
used  to  determine  the  mode  of  operation  for  the  AHU.  A subset  of  the  expert  rules  corresponding 
to  that  mode  of  operation  is  then  evaluated  to  determine  if  there  is  a mechanical  fault  or  a control 
problem.  VAV  box  Performance  Assessment  Control  Charts  (VPACC)  is  a diagnostic  tool  that 
uses  statistical  quality  control  measures  to  detect  faults  or  control  problems  in  VAV  boxes. 

This  report  describes  a research  study  of  the  application  of  APAR  and  VPACC  to  HVAC 
systems  in  real  buildings.  AHU  and  VAV  box  data  were  collected  from  several  field  sites.  The 
study  examined  the  effectiveness  of  the  tools  in  detecting  commonly  found  mechanical  faults 
and  control  problems,  the  reliability  of  the  tools  across  different  building  uses  and  climate 
regions,  and  the  robustness  of  the  tools  in  handling  data  from  a variety  of  HVAC  system 
configurations.  APAR  and  VPACC  were  both  found  to  be  successful  at  finding  a wide  variety  of 
faults.  Both  tools  appear  to  be  suitable  for  embedding  in  commercial  control  products. 

Key  words:  BACnet,  building  automation  and  control,  direct  digital  control,  energy  management 
systems,  fault  detection  and  diagnostics,  cybernetic  building  systems 
1 Introduction


Building  HVAC  equipment  routinely  fails  to  satisfy  performance  expectations  envisioned 
at  design.  Such  failures  often  go  unnoticed  for  extended  periods  of  time.  Additionally, 
higher  expectations  are  being  placed  on  a combination  of  different  and  often  conflicting 
performance  measures,  such  as  energy  efficiency,  indoor  air  quality,  comfort,  reliability, 
limiting  peak  demand  on  utilities,  etc.  To  meet  these  expectations,  the  processes,  systems, 
and  equipment  used  in  both  commercial  and  residential  buildings  are  becoming 
increasingly  sophisticated.  This  development  both  necessitates  the  use  of  automated 
diagnostics  to  ensure  fault-free  operation  and  enables  diagnostic  capabilities  for  the 
various  building  systems  by  providing  a distributed  platform  that  is  powerful  and  flexible 
enough  to  perform  fault  detection  and  diagnostics  (FDD). 

The  purpose  of  the  research  effort  described  in  this  report  is  to  develop,  test,  and 
demonstrate  FDD  methods  that  can  detect  common  mechanical  faults  and  control  errors 
in  air-handling  units  (AHUs)  and  variable-air-volume  (VAV)  boxes.  The  tools  are 
intended  to  be  sufficiently  simple  that  they  can  be  embedded  in  commercial  building 
control  systems  and  rely  upon  only  sensor  data  and  control  signals  that  are  commonly 
available  in  commercial  building  automation  and  control  systems. 

AHU  Performance  Assessment  Rules  (APAR)  is  a diagnostic  tool  that  uses  a set  of  expert 
rules  derived  from  mass  and  energy  balances  to  detect  common  faults  in  air-handling 
units.  Control  signals  are  used  to  determine  the  mode  of  operation  for  the  AHU.  A subset 
of  the  expert  rules  corresponding  to  that  mode  of  operation  is  then  evaluated  to  determine 
if  there  is  a mechanical  fault  or  a control  problem.  VAV  box  Performance  Assessment 
Control  Charts  (VPACC)  is  a diagnostic  tool  that  uses  statistical  quality  control  measures 
to  detect  faults  or  control  problems  in  VAV  boxes.  VPACC  can  be  applied  to  most  VAV 
box  control  strategies.  Fault  thresholds  are  determined  by  statistical  analysis  of  a database 
of  “normal  operation”  data.  The  FDD  tools  for  AHUs  and  VAV  boxes  are  being 
developed  with  distinct  approaches  because  of  the  nature  of  the  systems.  VAV  boxes  are 
simple  devices  with  a limited  number  of  operation  modes  and  possible  faults.  Because  the 
building  industry  is  sensitive  to  first  cost,  the  VAV  boxes  typically  have  little 
instrumentation  and  controllers  with  limited  capability.  However,  VAV  boxes  are  very 
numerous  in  a typical  HVAC  system,  resulting  in  a large  amount  of  data  to  be  monitored 
for  faults.  AHUs  are  more  complex  and  thus  susceptible  to  more  kinds  of  faults.  They 
also  tend  to  have  more  instrumentation  and  more  capable  controllers.  The  FDD  tools  for 
both  systems  are  designed  to  be  robust  so  that  they  can  adapt  to  the  variety  of 
applications  typical  of  their  use. 

A previous study [1] describes the theoretical basis of APAR and VPACC and evaluates
the tools' performance using data generated by simulation, emulation, and laboratory
testing. The study examined the breadth of faults that can be detected and the conditions
under which they can be detected. This report describes the results of a research study to
extend the previous work by evaluating the application of APAR and VPACC to HVAC
systems in real buildings. The research, involving the collection of AHU and VAV box
data from several field sites, examined the effectiveness of these tools in detecting
commonly found mechanical faults and control problems, the reliability of the tools
across several seasons, and the robustness of the tools in handling data from a variety of
system types and configurations. The sites include an office building, a restaurant, and
community college and university campuses, featuring constant- and
variable-air-volume systems.

2 Methodology 

2.1  FDD  for  Air  Handling  Units 

The  fault  detection  tool  described  in  this  section  was  developed  for  application  to  single 
duct  variable-volume  or  constant-volume  air  handlers  with  hydronic  heating  and  cooling 
coils  and  airside  economizers.  The  rules  that  are  used  for  FDD  focus  on  temperature 
control  in  an  AHU.  Hence,  the  system  description  will  be  restricted  to  components  and 
control  strategies  directly  related  to  temperature  control.  Figure  2.1  is  a schematic 
diagram  of  a typical  single  duct  air  handling  unit  (AHU). 


Figure 2.1: Schematic diagram of a single duct air-handling unit

2.1.1 System Description

The  AHU  controller  typically  controls  the  supply  air  temperature  to  maintain  a setpoint 
temperature  at  a location  in  the  supply  duct  downstream  of  the  supply  fan.  Outdoor  air 
enters  the  AHU  and  is  mixed  with  air  returned  from  the  building.  The  mixed  air  passes 
over the heating and cooling coils, where, if necessary, it is conditioned prior to being
supplied  to  the  building.  The  typical  operating  sequence  for  AHUs  consists  of  four 
primary  modes  of  operation  during  occupied  periods  for  maintaining  the  supply  air 
temperature and the ventilation at preset levels. The relationship of the four operating
modes  to  the  control  of  the  heating  coil  valve,  the  cooling  coil  valve  and  the  mixing  box 
dampers  is  shown  in  Figure  2.2.  Sequencing  logic  determines  the  mode  of  operation  as 
dictated  by  various  thermal  relationships  including  the  internal  and  external  loads  on  the 
zones  served  by  the  AHU. 

In  the  heating  mode  (Mode  1 in  Figure  2.2),  the  heating  coil  valve  is  controlled  to 
maintain  the  supply  air  temperature  at  the  heating  set  point  and  the  cooling  coil  valve  is 
closed.  The  mixed  air  dampers  are  positioned  to  allow  the  minimum  outdoor  air 
necessary  to  satisfy  ventilation  requirements. 

Figure 2.2: Typical operating modes of an air-handling unit

As  cooling  loads  increase,  the  AHU  transitions  from  heating  to  cooling  with  outside  air 
(Mode  2).  In  this  mode,  the  heating  and  cooling  coil  valves  are  closed  and  the  mixing  box 
dampers  are  modulated  to  maintain  the  supply  air  temperature  at  cooling  set  point.  As  the 
loads  continue  to  increase,  the  mixing  dampers  eventually  saturate  with  the  outdoor  air 
damper  fully  open  and  the  AHU  changes  over  to  mechanical  cooling.  When  the  AHU  is 
operating  in  one  of  the  mechanical  cooling  modes  (Modes  3 and  4),  the  cooling  coil  valve 
modulates  to  maintain  the  supply  air  temperature  at  cooling  set  point,  the  heating  coil 
valve  is  closed,  and  the  outdoor  air  damper  is  either  fully  open  or  at  its  minimum 
position. There are several different types of economizer controls; generally, the
economizer control logic uses a comparison of the outdoor and return air temperatures or
enthalpies  to  determine  the  proper  position  of  the  outdoor  air  damper  such  that 
mechanical  cooling  requirements  are  minimized.  Hence,  the  third  primary  mode  (Mode 
3)  of  operation  is  mechanical  cooling  with  100  % outdoor  air  and  the  fourth  primary 
mode  (Mode  4)  of  operation  is  mechanical  cooling  with  minimum  outdoor  air. 

2.1.2  AHU  Performance  Assessment  Rules  (APAR) 

The  basis  for  the  fault  detection  methodology  is  a set  of  expert  rules  used  to  assess  the 
performance  of  the  AHU.  The  tool  developed  from  these  rules  is  referred  to  as  APAR 
(AHU  Performance  Assessment  Rules).  APAR  uses  control  signals  and  occupancy 
information  to  identify  the  mode  of  operation  of  the  AHU,  thereby  identifying  a subset  of 
the  rules  that  specify  temperature  relationships  that  are  applicable  for  that  mode.  The  two 
main mode classifications are occupied and unoccupied. For occupied periods, the mode
is further categorized as described in Section 2.1.1. For convenience, the
operating modes are summarized below:

• Mode 1: heating

• Mode  2:  cooling  with  outdoor  air 

• Mode  3:  mechanical  cooling  with  100  % outdoor  air 

• Mode  4:  mechanical  cooling  with  minimum  outdoor  air 

• Mode  5:  unknown 

Because the direct digital control (DDC) outputs to the actuators of the heating and
cooling coil valves and the mixing box dampers are known, the mode of operation can be
ascertained. Although not depicted in Figure 2.2, a fifth mode of operation, referred to as
“unknown” operation, has been defined and listed above. The unknown mode applies to
the  case  in  which  the  AHU  is  running  in  an  occupied  mode,  but  none  of  the  control 
output  relationships  defined  for  Modes  1-4  are  satisfied.  The  unknown  mode  could  be 
associated  with  mode  transitions  and/or  with  faulty  operation  such  as  simultaneous 
heating  and  cooling. 
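
The mode determination described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact control-output relationships that define Modes 1-4 are not spelled out in this section, so the conditions below are inferred from the mode descriptions in Section 2.1.1, and the names `classify_mode`, `u_d_min`, and `eps` (a small tolerance on the normalized signals) are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    HEATING = 1
    COOLING_WITH_OUTDOOR_AIR = 2
    MECH_COOLING_100_OA = 3
    MECH_COOLING_MIN_OA = 4
    UNKNOWN = 5


def classify_mode(u_hc, u_cc, u_d, u_d_min=0.2, eps=0.02):
    """Infer the occupied operating mode from normalized control signals.

    u_hc, u_cc, u_d are the heating valve, cooling valve, and mixing box
    damper signals in [0, 1]. u_d_min (minimum damper position) and eps
    are assumed values chosen for illustration only.
    """
    hc_closed = u_hc <= eps
    cc_closed = u_cc <= eps
    damper_at_min = u_d <= u_d_min + eps
    damper_full = u_d >= 1.0 - eps

    if not hc_closed and cc_closed and damper_at_min:
        return Mode.HEATING
    if hc_closed and cc_closed and not damper_at_min and not damper_full:
        return Mode.COOLING_WITH_OUTDOOR_AIR
    if hc_closed and not cc_closed and damper_full:
        return Mode.MECH_COOLING_100_OA
    if hc_closed and not cc_closed and damper_at_min:
        return Mode.MECH_COOLING_MIN_OA
    # None of the Mode 1-4 relationships hold, e.g. simultaneous
    # heating and cooling or a transition between modes.
    return Mode.UNKNOWN
```

For example, `classify_mode(0.3, 0.3, 0.5)` falls through to `Mode.UNKNOWN` because both valves are open at once.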

Once  the  mode  of  operation  has  been  established,  rules  based  on  conservation  of  mass 
and  energy  can  be  used  along  with  the  sensor  information  that  is  typically  available  for 
controlling  the  AHUs.  For  example,  normal  operation  in  the  mechanical  cooling  mode 
with  100  % outdoor  air  (Mode  3)  dictates  that  the  outdoor  and  mixed  air  temperatures 
must be approximately equal. Defining $T_{oa}$ and $T_{ma}$ as the outdoor air and mixed air
temperatures, respectively, the rule (defined as Rule 10) is written as

Rule 10: $|T_{oa} - T_{ma}| > \varepsilon_t$

where $\varepsilon_t$ is a threshold that depends on the uncertainty (or accuracy) of the
measurements.  The  rules  are  written  such  that  a fault  is  indicated  if  a rule  is  true.  In  the 
example  above,  the  rule  states  that  if  the  outdoor  and  mixed  air  temperatures  are  not  the 
same  (i.e.,  if  true)  a fault  has  occurred. 
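
As a minimal sketch of how such a rule might be evaluated, the function below returns True when Rule 10 indicates a fault. The function name and the sample temperatures are hypothetical and serve only to illustrate the "true implies fault" convention.

```python
def rule_10_fault(t_oa, t_ma, eps_t):
    """Rule 10 (Mode 3): fault if outdoor and mixed air temperatures differ
    by more than the threshold eps_t. All values share one temperature unit."""
    return abs(t_oa - t_ma) > eps_t
```

With an assumed threshold of 1.0 °C, readings of 25.0 °C and 25.4 °C do not trigger the rule, while readings of 25.0 °C and 28.0 °C do.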

As  a detailed  description  of  the  28  APAR  rules  and  the  reasoning  behind  them  is 
available  elsewhere  [2],  the  rules  are  simply  listed  in  Table  2.1  without  detailed 
explanation.  Table  2.1  groups  the  rules  according  to  mode  of  operation.  As  indicated  in 
the  column  heading  for  the  rule  expression,  a true  expression  indicates  a fault.  Table  2.2 
presents  the  rules  as  related  groups  and  indicates  the  sensors  and  control  signals  used  to 
evaluate  each  rule.  The  first  group  of  rules  treats  the  relationship  of  temperatures  in  the 
coil subsystem of the AHU. For these four rules, only the relational operator in the rules
changes from one mode to another. A typical rule from this subgroup requires the supply
air  temperature  to  be  lower  than  the  sum  of  the  mixed  air  temperature  and  the 
temperature  rise  across  the  supply  fan  in  the  mechanical  cooling  modes.  There  are  also 
groups  of  rules  treating  the  mixing  box  subsystem,  the  zone  subsystem,  economizer 
operation,  comfort  requirements,  and  controller  logic/tuning.  Hence,  although  there  are 
28  rules,  in  reality  only  a small  number  of  temperature  and  control  signal  relationships 
are  used  to  define  the  rules. 


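Table 2.1: APAR Rule Set

| Mode | Rule # | Rule Expression (true implies existence of a fault) |
|---|---|---|
| Heating (Mode 1) | 1 | $T_{sa} < T_{ma} + \Delta T_{sf} - \varepsilon_t$ |
| | 2 | For $\lvert T_{ra} - T_{oa} \rvert \ge \Delta T_{min}$: $\lvert Q_{oa}/Q_{sa} - (Q_{oa}/Q_{sa})_{min} \rvert > \varepsilon_f$ |
| | 3 | $\lvert u_{hc} - 1 \rvert \le \varepsilon_{hc}$ and $T_{sa,s} - T_{sa} \ge \varepsilon_t$ |
| | 4 | $\lvert u_{hc} - 1 \rvert \le \varepsilon_{hc}$ |
| Cooling with Outdoor Air (Mode 2) | 5 | $T_{oa} > T_{sa,s} - \Delta T_{sf} + \varepsilon_t$ |
| | 6 | $T_{sa} > T_{ra} - \Delta T_{rf} + \varepsilon_t$ |
| | 7 | $\lvert T_{sa} - \Delta T_{sf} - T_{ma} \rvert > \varepsilon_t$ |
| Mechanical Cooling with 100 % Outdoor Air (Mode 3) | 8 | $T_{oa} < T_{sa,s} - \Delta T_{sf} - \varepsilon_t$ |
| | 9 | $T_{oa} > T_{co} + \varepsilon_t$ |
| | 10 | $\lvert T_{oa} - T_{ma} \rvert > \varepsilon_t$ |
| | 11 | $T_{sa} > T_{ma} + \Delta T_{sf} + \varepsilon_t$ |
| | 12 | $T_{sa} > T_{ra} - \Delta T_{rf} + \varepsilon_t$ |
| | 13 | $\lvert u_{cc} - 1 \rvert \le \varepsilon_{cc}$ and $T_{sa} - T_{sa,s} \ge \varepsilon_t$ |
| | 14 | $\lvert u_{cc} - 1 \rvert \le \varepsilon_{cc}$ |
| Mechanical Cooling with Minimum Outdoor Air (Mode 4) | 15 | $T_{oa} < T_{co} - \varepsilon_t$ |
| | 16 | $T_{sa} > T_{ma} + \Delta T_{sf} + \varepsilon_t$ |
| | 17 | $T_{sa} > T_{ra} - \Delta T_{rf} + \varepsilon_t$ |
| | 18 | For $\lvert T_{ra} - T_{oa} \rvert \ge \Delta T_{min}$: $\lvert Q_{oa}/Q_{sa} - (Q_{oa}/Q_{sa})_{min} \rvert > \varepsilon_f$ |
| | 19 | $\lvert u_{cc} - 1 \rvert \le \varepsilon_{cc}$ and $T_{sa} - T_{sa,s} \ge \varepsilon_t$ |
| | 20 | $\lvert u_{cc} - 1 \rvert \le \varepsilon_{cc}$ |
| Unknown Occupied Modes (Mode 5) | 21 | $u_{cc} > \varepsilon_{cc}$ and $u_{hc} > \varepsilon_{hc}$ and $\varepsilon_d < u_d < 1 - \varepsilon_d$ |
| | 22 | $u_{hc} > \varepsilon_{hc}$ and $u_{cc} > \varepsilon_{cc}$ |
| | 23 | $u_{hc} > \varepsilon_{hc}$ and $u_d > \varepsilon_d$ |
| | 24 | $\varepsilon_d < u_d < 1 - \varepsilon_d$ and $u_{cc} > \varepsilon_{cc}$ |
| All Occupied Modes (Mode 1, 2, 3, 4, or 5) | 25 | $\lvert T_{sa} - T_{sa,s} \rvert > \varepsilon_t$ |
| | 26 | $T_{ma} < \min(T_{ra}, T_{oa}) - \varepsilon_t$ |
| | 27 | $T_{ma} > \max(T_{ra}, T_{oa}) + \varepsilon_t$ |
| | 28 | Number of mode transitions per hour $> MT_{max}$ |
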
where

$MT_{max}$ = maximum number of mode changes per hour
$T_{sa}$ = supply air temperature
$T_{ma}$ = mixed air temperature
$T_{ra}$ = return air temperature
$T_{oa}$ = outdoor air temperature
$T_{co}$ = changeover air temperature for switching between Modes 3 and 4
$T_{sa,s}$ = supply air temperature set point
$\Delta T_{sf}$ = temperature rise across the supply fan
$\Delta T_{rf}$ = temperature rise across the return fan
$\Delta T_{min}$ = threshold on the minimum temperature difference between the return and outdoor air
$Q_{oa}/Q_{sa}$ = outdoor air fraction = $(T_{ma} - T_{ra})/(T_{oa} - T_{ra})$
$(Q_{oa}/Q_{sa})_{min}$ = threshold on the minimum outdoor air fraction
$u_{hc}$ = normalized heating coil valve control signal [0,1] where $u_{hc} = 0$ indicates the valve is closed and $u_{hc} = 1$ indicates it is 100 % open
$u_{cc}$ = normalized cooling coil valve control signal [0,1] where $u_{cc} = 0$ indicates the valve is closed and $u_{cc} = 1$ indicates it is 100 % open
$u_d$ = normalized mixing box damper control signal [0,1] where $u_d = 0$ indicates the outdoor air damper is closed and $u_d = 1$ indicates it is 100 % open
$\varepsilon_t$ = threshold for errors in temperature measurements
$\varepsilon_f$ = threshold parameter accounting for errors related to airflows (function of uncertainties in temperature measurements)
$\varepsilon_{hc}$ = threshold parameter for the heating coil valve control signal
$\varepsilon_{cc}$ = threshold parameter for the cooling coil valve control signal
$\varepsilon_d$ = threshold parameter for the mixing box damper control signal

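
The outdoor air fraction defined above, and the way Rules 2 and 18 use it, can be illustrated with a short Python sketch. The function names and all numerical values are hypothetical. The guard on the return/outdoor temperature difference reflects the "For ..." condition attached to these rules: when the two temperatures are too close, the fraction is poorly conditioned and the rule is not evaluated.

```python
def outdoor_air_fraction(t_ma, t_ra, t_oa):
    """Outdoor air fraction from a temperature balance on the mixing box:
    (T_ma - T_ra) / (T_oa - T_ra)."""
    return (t_ma - t_ra) / (t_oa - t_ra)


def min_oa_rule_fault(t_ma, t_ra, t_oa, oa_frac_min, dt_min, eps_f):
    """Rules 2 and 18: compare the outdoor air fraction with its minimum value.

    Returns None when the return and outdoor air temperatures differ by less
    than dt_min, in which case the rule is skipped rather than evaluated.
    """
    if abs(t_ra - t_oa) < dt_min:
        return None
    frac = outdoor_air_fraction(t_ma, t_ra, t_oa)
    return abs(frac - oa_frac_min) > eps_f
```

With a return air temperature of 22 °C, outdoor air at 12 °C, and mixed air at 20 °C, the fraction is 0.2. A mixed air temperature of 15 °C under the same conditions gives 0.7, which would trigger the rule for an assumed minimum fraction of 0.2 and threshold of 0.3.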


2.1.2.1 Operational and Design Data Requirements

APAR  uses  the  following  occupancy  information,  setpoint  values,  sensor  measurements, 
and  control  signals: 


• Occupancy status;

• Supply air temperature set point;

• Supply air temperature;

• Return air temperature;

• Mixed air temperature;

• Outdoor air temperature;

• Cooling coil valve control signal;

• Heating coil valve control signal;

• Mixing box damper control signal;

• Return air relative humidity (for enthalpy-based economizers only);

• Outdoor air relative humidity (for enthalpy-based economizers only).

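Table 2.2: Summary of Rule Relationships

| Rule | Mode * | Sensors and Control Signals | Relationship Between Grouped Rules |
|---|---|---|---|
| 1 | 1 | $T_{sa}$, $T_{ma}$, $\Delta T_{sf}$ | Coil Subsystem: The relational sign (<, >, etc.) changes based on the mode of operation. |
| 7 | 2 | $T_{sa}$, $T_{ma}$, $\Delta T_{sf}$ | |
| 11 | 3 | $T_{sa}$, $T_{ma}$, $\Delta T_{sf}$ | |
| 16 | 4 | $T_{sa}$, $T_{ma}$, $\Delta T_{sf}$ | |
| 2 | 1 | $T_{ra}$, $T_{ma}$, $T_{oa}$ | Mixing Box Subsystem: Rules are related through calculation of outdoor air fraction. If Rule 26 or 27 is satisfied, the outdoor air fraction will be negative or greater than unity. |
| 18 | 4 | $T_{ra}$, $T_{ma}$, $T_{oa}$ | |
| 26 | 1, 2, 3, 4 | $T_{ra}$, $T_{ma}$, $T_{oa}$ | |
| 27 | 1, 2, 3, 4 | $T_{ra}$, $T_{ma}$, $T_{oa}$ | |
| 25 | 1, 2, 3, 4 | $T_{sa}$, $T_{sa,s}$ | Comfort Requirements: The first four rules indicate comfort is sacrificed (with Rules 3, 13, and 19 indicating the system is out of control), whereas the latter three rules indicate comfort could soon be sacrificed (system is out of control). |
| 3 | 1 | $T_{sa}$, $T_{sa,s}$, $u_{hc}$ | |
| 13 | 3 | $T_{sa}$, $T_{sa,s}$, $u_{cc}$ | |
| 19 | 4 | $T_{sa}$, $T_{sa,s}$, $u_{cc}$ | |
| 4 | 1 | $u_{hc}$ | |
| 14 | 3 | $u_{cc}$ | |
| 20 | 4 | $u_{cc}$ | |
| 5 | 2 | $T_{oa}$, $T_{sa,s}$, $\Delta T_{sf}$ | The relational sign (<, >, etc.) changes based on the mode of operation. |
| 8 | 3 | $T_{oa}$, $T_{sa,s}$, $\Delta T_{sf}$ | |
| 6 | 2 | $T_{sa}$, $T_{ra}$, $\Delta T_{rf}$ | Zone Subsystem: Rules are identical. |
| 12 | 3 | $T_{sa}$, $T_{ra}$, $\Delta T_{rf}$ | |
| 17 | 4 | $T_{sa}$, $T_{ra}$, $\Delta T_{rf}$ | |
| 9 | 3 | $T_{oa}$, $T_{co}$ | Economizer: The relational sign (<, >, etc.) changes based on the mode of operation. |
| 15 | 4 | $T_{oa}$, $T_{co}$ | |
| 10 | 3 | $T_{ma}$, $T_{oa}$ | |
| 21 | - | $u_{cc}$, $u_{hc}$, $u_d$ | Controller Logic/Tuning: Rules are related and identify periods of operation associated with controller problems, such as simultaneous heating and cooling, and excessive mode changes. |
| 22 | - | $u_{cc}$, $u_{hc}$ | |
| 23 | - | $u_{hc}$, $u_d$ | |
| 24 | - | $u_{cc}$, $u_d$ | |
| 28 | - | $u_{cc}$, $u_{hc}$, $u_d$ | |

* The dash symbol indicates either an unknown mode or multiple modes of operation.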


With  the  possible  exception  of  the  mixed  air  temperature,  this  information  is  generally  available 
for  most  AHUs  controlled  with  a DDC  system.  If  one  or  more  sensors  are  not  available,  certain 
rules  will  no  longer  be  applicable.  For  instance,  in  the  absence  of  a mixed  air  temperature  sensor, 
nine  rules  listed  in  Table  2.1  (Rules  1,  2,  7,  10,  11,  16,  18,  26,  and  27)  will  be  eliminated  from 
consideration  in  APAR.  Conversely,  the  presence  of  additional  sensors  would  expand  the  rule  set 
and  provide  an  opportunity  to  either  detect  more  faults,  or  to  detect  faults  during  modes  of 
operation  in  which  they  would  normally  be  hidden.  For  instance,  if  a temperature  sensor  was 
installed  between  the  heating  and  cooling  coils,  leakage  through  the  heating  valve  could  be 
detected  during  the  mechanical  cooling  modes,  whereas  normally  it  would  be  masked  in  these 
modes. 
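
The effect of a missing sensor on the rule set can be sketched as a simple lookup. The mapping below lists, for each rule, the temperature sensors its expression in Table 2.1 appears to require; it is an illustrative reading of that table rather than part of APAR itself, and the names `RULE_TEMPERATURE_SENSORS` and `eliminated_rules` are hypothetical.

```python
# Temperature sensors used by each rule, as read from Table 2.1.
# Rules that rely only on control signals map to an empty set.
RULE_TEMPERATURE_SENSORS = {
    1: {"T_sa", "T_ma"},          2: {"T_ra", "T_oa", "T_ma"},
    3: {"T_sa"},                  4: set(),
    5: {"T_oa"},                  6: {"T_sa", "T_ra"},
    7: {"T_sa", "T_ma"},          8: {"T_oa"},
    9: {"T_oa"},                  10: {"T_oa", "T_ma"},
    11: {"T_sa", "T_ma"},         12: {"T_sa", "T_ra"},
    13: {"T_sa"},                 14: set(),
    15: {"T_oa"},                 16: {"T_sa", "T_ma"},
    17: {"T_sa", "T_ra"},         18: {"T_ra", "T_oa", "T_ma"},
    19: {"T_sa"},                 20: set(),
    21: set(), 22: set(), 23: set(), 24: set(),
    25: {"T_sa"},                 26: {"T_ma", "T_ra", "T_oa"},
    27: {"T_ma", "T_ra", "T_oa"}, 28: set(),
}


def eliminated_rules(missing_sensors):
    """Return the rule numbers that cannot be evaluated without the given sensors."""
    missing = set(missing_sensors)
    return sorted(rule for rule, sensors in RULE_TEMPERATURE_SENSORS.items()
                  if sensors & missing)
```

Calling `eliminated_rules({"T_ma"})` reproduces the nine rules named above for a unit without a mixed air temperature sensor.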


In addition to the operational data listed above, certain design data are needed to implement the
rules. The required design data are:

• Minimum  and  maximum  values  of  control  signals  for  the  heating  coil  valve,  cooling  coil 
valve  and  mixing  box  dampers  for  normalizing  the  control  signals; 

• Percentage  outdoor  air  necessary  to  satisfy  ventilation  requirements; 

• Changeover  temperature  from  mechanical  cooling  with  100  % outdoor  air  to  mechanical 
cooling  with  minimum  outdoor  air  (or  equivalent  condition  for  enthalpy-based  economizer); 

• Description  of  sequencing/economizer  cycle  strategy  (used  to  verify  that  the  rules  are 
suitable  to  a particular  AHU  installation). 
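
As an example of how the first design item might be used, the sketch below scales a raw control output to the normalized [0, 1] range expected by the rules. The function name and the 2 V to 10 V actuator range are assumptions for illustration, and clamping out-of-range values is a design choice of this sketch rather than something specified in the report.

```python
def normalize_signal(raw, raw_min, raw_max):
    """Scale a raw control output to [0, 1] using its design minimum and maximum."""
    if raw_max == raw_min:
        raise ValueError("raw_min and raw_max must differ")
    u = (raw - raw_min) / (raw_max - raw_min)
    # Clamp so that small overshoots in the raw signal stay within [0, 1].
    return min(1.0, max(0.0, u))
```

For an assumed 2 V to 10 V valve actuator, a 6 V output corresponds to a normalized signal of 0.5.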
2.1.2.2 Detecting and Diagnosing Faults

APAR  does  not  search  for  the  existence  of  a specific  set  of  faults.  Rather,  any  fault  that  causes  a 
rule  to  be  satisfied  would  be  detected  and  additional  effort  would  be  necessary  to  isolate  the 
source  of  the  problem.  Emulation,  simulation,  and  laboratory  experimentation  from  previous 
work  [1]  as  well  as  the  current  study  demonstrate  that  the  rule  set  can  identify  the  following 
faults: 

• Stuck  or  leaking  mixing  box  dampers,  heating  coil  valves,  and  cooling  coil  valves; 

• Temperature  sensor  faults; 

• Design  faults  such  as  undersized  coils; 

• Controller  programming  errors  related  to  tuning,  setpoints,  and  sequencing  logic; 

• Inappropriate  operator  intervention. 

The  operating  point,  severity  of  a fault,  and  threshold  selection  for  the  rules  will  obviously 
influence  when  a particular  rule  is  satisfied.  Threshold  selection  is  discussed  next. 

2.1.2.3 Threshold Selection

In addition to the sensor, control signal, and setpoint information, there are other parameters
that must be specified for APAR. For instance, estimates of the temperature rise across the
supply fan (and return fan, if one exists) must be provided; a reasonable default is 1.1 °C (2.0 °F).
A model-based value correlated to the airflow rate or the control signal to the fan could be used
as the basis for this estimate; however, some amount of training data would likely be necessary
to establish the correlation. Thresholds used in evaluation of rules, such as $\varepsilon_t$ in Rule 10, must also
be specified. Another approach might be to calculate the threshold values based on the
uncertainty of each sensor or actuator value. As an example, the threshold in Rule 10 would be
determined from the expression

$\varepsilon_t = \varepsilon_{T_{oa}} + \varepsilon_{T_{ma}}$

where $\varepsilon_{T_{oa}}$ and $\varepsilon_{T_{ma}}$ are the uncertainties associated with the measurement of the outdoor and
mixed air temperatures. If a threshold is too great, the associated fault(s) must be relatively
severe  to  be  detected.  If,  on  the  other  hand,  a threshold  is  too  small,  normal  variation  in 
operating  conditions  may  result  in  false  alarms.  These  threshold  values  are  currently  determined 
heuristically  for  each  site. 
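
The trade-off between missed faults and false alarms can be illustrated with a small sketch. It builds the Rule 10 threshold from two assumed sensor uncertainties and counts how many samples of the outdoor/mixed air temperature difference would be flagged at a given threshold. The function names and all numbers are hypothetical.

```python
def rule_10_threshold(eps_t_oa, eps_t_ma):
    """Threshold for Rule 10 as the sum of the two measurement uncertainties."""
    return eps_t_oa + eps_t_ma


def count_alarms(differences, eps_t):
    """Count samples whose absolute temperature difference exceeds eps_t."""
    return sum(1 for d in differences if abs(d) > eps_t)
```

With assumed uncertainties of 0.5 °C for each sensor, the threshold is 1.0 °C. For the sample differences `[0.3, -0.8, 1.6, 0.2]`, that threshold flags one sample, whereas a much tighter threshold of 0.25 °C flags three, most of which would plausibly be normal variation.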

2.1.3  Instrumentation  Accuracy  Requirements 

APAR  uses  existing  sensor  points  in  the  control  system  to  perform  the  fault  detection 
calculations.  The  typical  industrial  grade  sensors  that  are  already  installed  for  control  purposes 
have  sufficient  accuracy.  Laboratory  grade  instruments  are  not  required.  Higher  quality  sensors 
that  have  been  installed  and  calibrated  properly  will  allow  the  use  of  tighter  thresholds  (less 
severe  faults  can  be  detected)  than  lower  quality  sensors,  or  those  that  have  been  poorly 
calibrated  or  installed. 


2.2  FDD  for  VAV  Boxes 


2.2.1  VAV  box  Performance  Assessment  Control  Charts  - VPACC 

The  primary  purpose  of  heating,  ventilating,  and  air-conditioning  (HVAC)  equipment  in 
commercial  buildings  is  to  provide  a comfortable  and  healthy  environment  for  occupants. 
Variable-air-volume  (VAV)  air  handling  systems  are  common  for  conditioning  air  and  delivering 
the  air  to  occupied  zones.  VAV  boxes  are  an  integral  part  of  such  systems  and  are  the  final  piece 
of  equipment  that  air  passes  through  prior  to  reaching  the  occupants.  As  such,  it  is  important  to 
ensure  that  these  devices  operate  correctly. 

The  challenges  presented  in  detecting  and  diagnosing  faults  in  VAV  boxes  are  similar  to  those 
encountered  with  other  pieces  of  HVAC  equipment.  Generally  there  are  very  few  sensors, 
making  it  difficult  to  ascertain  what  is  happening  in  the  device.  Limitations  associated  with 
controller  memory  and  communication  capabilities  further  complicate  the  task.  The  number  of 
different  types  of  VAV  boxes  and  lack  of  standardized  control  sequences  add  a final  level  of 
complexity  to  the  challenge.  This  set  of  constraints  is  counterbalanced  by  the  fact  that  VAV 
boxes  are  much  more  numerous  than  other  pieces  of  HVAC  equipment.  For  instance,  buildings 
may  have  ten  to  fifteen  times  more  VAV  boxes  than  air-handling  units.  Hence,  maintenance 
staffs  would  clearly  benefit  from  a tool  that  assisted  them  in  monitoring  VAV  box  operation. 

The  needs  and  constraints  described  above  have  led  to  the  development  of  VAV  Box 
Performance  Assessment  Control  Charts  (VPACC),  a fault  detection  tool  that  uses  a small 
number  of  control  charts  to  assess  the  performance  of  VAV  boxes.  The  underlying  approach, 
while  developed  for  a specific  type  of  VAV  box  and  control  sequence,  is  general  in  nature  and 
can  be  adapted  to  other  types  of  VAV  boxes.  This  section  describes  the  basic  concept  of  control 
charts  and  their  use  for  determining  when  control  processes  have  gone  “out  of  control”.  The 
specific  control  charts  developed  and  implemented  in  VPACC  are  then  presented  for  a single 
duct  pressure-independent  throttling  VAV  box  with  reheat. 

2.2.2  Control  Charts 

Control  charts  are  common  tools  for  monitoring  control  processes  wherein  a measured  quantity 
is  compared  to  upper  and  lower  limits  that  define  allowable  (or  fault  free)  operation.  If  the 
measured  quantity  falls  outside  these  limits,  the  process  is  said  to  be  “out  of  control.”  The  limits 
are  typically  defined  using  statistical  parameters  and,  therefore,  control  charts  are  often  referred 
to  as  statistical  quality  control  charts. 

There  are  many  different  types  of  control  charts.  VPACC  implements  an  algorithm  known  as  a 
CUSUM  (cumulative  sum)  chart.  The  basic  concept  behind  CUSUM  charts  is  to  accumulate  the 
error  between  a process  output  and  the  expected  value  of  the  output.  Large  values  of  the 
accumulated error are indicative of an out of control process. With the process output at
sampling time i denoted $x_i$, the estimate of the expected value denoted $\bar{x}$, and the estimate of the
process standard deviation denoted by $\hat{\sigma}$, the normalized process output is given by:

$z_i = (x_i - \bar{x}) / \hat{\sigma}$ (1)

The normalized process output is used to compute two cumulative sums defined as follows:

$S_i = \max[0, z_i - k + S_{i-1}]$ (2a)

$T_i = \min[0, z_i + k + T_{i-1}]$ (2b)

where  k is  a slack  parameter  that  must  be  specified.  Positive  values  of  z greater  than  k cause  the 
sum  S to  move  away  from  zero  and  the  sum  T to  approach  or  remain  at  zero.  Negative  values  of  z 
less  than  -k  cause  the  sum  T to  move  away  from  zero  and  the  sum  S to  approach  or  remain  at 
zero.  A process  is  said  to  be  out  of  control  when  either  S exceeds  a threshold  value  defined  by 
the  parameter  h,  or  T falls  below  -h.  Figure  2.3  [3]  presents  normalized  data  and  the  S and  T 
cumulative  sums  for  k = 0.5  and  h = 5.  The  first  20  data  points  come  from  a random  normal 
distribution  with  a mean  value  of  zero  and  a standard  deviation  of  unity.  The  mean  value  is  then 
increased  to  0.25,  0.5,  0.75  and  1.0  for  subsequent  sets  of  20  data  points.  Note  that  S exceeds  the 
threshold  value  of  h after  about  68  data  points.  Because  the  mean  value  increases  above  0,  the 
cumulative  sum  T remains  above  its  threshold  of  -5. 
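
The two-sided CUSUM recursion described above can be written as a short Python sketch. This is not the VPACC implementation; the function name, the zero initial sums, and the deterministic sample data are assumptions made for illustration.

```python
def cusum(xs, mean, std, k=0.5, h=5.0):
    """Two-sided CUSUM over process outputs xs.

    mean and std are the estimates of the expected value and standard
    deviation used to normalize each output. Returns a list of
    (S, T, out_of_control) tuples, one per sample.
    """
    s = 0.0  # upper cumulative sum, starts at zero (assumed)
    t = 0.0  # lower cumulative sum, starts at zero (assumed)
    results = []
    for x in xs:
        z = (x - mean) / std          # normalized process output
        s = max(0.0, z - k + s)       # accumulates positive deviations beyond k
        t = min(0.0, z + k + t)       # accumulates negative deviations beyond -k
        results.append((s, t, s > h or t < -h))
    return results
```

For a constant normalized output of 1.5 with k = 0.5 and h = 5, S grows by 1.0 per sample, so the sixth sample is the first to exceed the threshold while T stays at zero.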


CUSUM  charts  are  generally  considered  to  be  effective  for  detecting  gradual  shifts  in  the  process 
mean.  The  most  commonly  used  control  charts  are  Shewhart  and  Shewhart-type  charts.  Shewhart 
charts  are  effective  for  detecting  large,  sudden  changes  in  the  process  mean.  Generally  Shewhart 
chart limits are set at values of $\bar{x} \pm 3\hat{\sigma}$. In terms of the normalized parameter z, the chart limits
are $z = \pm 3$. Shewhart charts were not investigated as part of this study; however, it is interesting
to  note  that  the  basic  CUSUM  and  Shewhart  charts  are  equivalent  if  the  CUSUM  parameters  k 
and  h are  selected  as  k = 3 and  h = 0. 
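
The stated equivalence can be checked numerically, at least up to the first alarm: with k = 3 and h = 0 the upper sum becomes positive only when a normalized sample exceeds 3, and the lower sum becomes negative only when a sample falls below -3. The sketch below compares the first alarm from each chart on hypothetical normalized data; the function names are assumptions.

```python
def first_cusum_alarm(zs, k, h):
    """Index of the first sample at which either cumulative sum crosses its limit."""
    s = t = 0.0
    for i, z in enumerate(zs):
        s = max(0.0, z - k + s)
        t = min(0.0, z + k + t)
        if s > h or t < -h:
            return i
    return None


def first_shewhart_alarm(zs, limit=3.0):
    """Index of the first normalized sample outside the +/- limit band."""
    for i, z in enumerate(zs):
        if abs(z) > limit:
            return i
    return None
```

After the first alarm the two charts can diverge, because the CUSUM sums carry the excess forward unless they are reset.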


Figure 2.3: A simple CUSUM control chart indicating an “out of control” process.

2.2.3 System Description

Figure 2.4 is a schematic diagram of a typical single duct variable-air-volume (VAV) box with
hydronic reheat. The diagram depicts a damper that is used to modulate airflow to the zone and a
control valve that modulates hot water flow to the reheat coil. Several sensors are also shown in
Figure 2.4. The zone thermostat measures the air temperature in the zone. The differential
pressure transducer is used to measure the flow rate of air into the zone. Finally, the discharge air
temperature sensor measures the temperature of the air stream entering the zone. This sensor is
used to provide diagnostic information rather than for control purposes. The VAV box controller
reads the sensor information, computes control outputs for the damper and reheat valve, and
transmits these signals to the appropriate actuators.

Figure 2.4: Schematic diagram of a single duct pressure-independent VAV box with
hydronic reheat.

Figure  2.5  shows  a typical  control  sequence  for  a pressure-independent  VAV  box.  A heating  set 
point  and  a cooling  set  point  are  specified.  As  the  zone  temperature  increases  above  the  cooling 
set  point,  the  airflow  rate  to  the  zone  increases  proportionally.  This  is  accomplished  by  resetting 
the  airflow  rate  setpoint  and  modulating  the  damper  to  achieve  this  airflow  rate.  As  the  zone 
temperature  decreases  toward  the  cooling  set  point,  the  airflow  rate  setpoint  is  decreased  and  the 
damper  gradually  closes  until  it  is  providing  the  minimum  flow  rate  necessary  for  ventilation.  If 
the  room  temperature  continues  to  decrease  and  reaches  the  heating  set  point,  the  reheat  valve 
will  begin  to  open.  The  airflow  rate  can  also  be  varied  in  the  heating  mode,  with  the  airflow 
increasing  as  the  temperature  decreases.  Alternatively,  a higher  fixed  airflow  rate  may  be 
specified  for  heating  operation  to  improve  the  distribution  of  the  warm  air.  In  Figure  2.5,  it  is 
assumed that a fixed airflow rate associated with the ventilation requirement of the room is
provided  in  the  heating  mode. 


Figure 2.5: Damper and valve control sequence as a function of room temperature for a
single duct pressure-independent VAV box with hydronic reheat (curves: valve position and airflow).

2.2.4  CUSUM  Applied  to  VAV  Box  Diagnostics 

The  previous  section  described  one  particular  VAV  box  control  strategy.  However,  a wide 
variety  of  control  strategies  are  employed  by  controller  manufacturers,  most  of  which  use  a 
cascaded  control  loop  to  maintain  the  zone  temperature  and  zone  airflow  rate  at  setpoint  values. 
In  order  to  make  VPACC  independent  of  the  control  strategy  used  in  a particular  controller/VAV 
box  application,  four  generic  errors  were  identified:  the  airflow  rate  error,  the  absolute  value  of 
the  airflow  rate  error,  the  temperature  error,  and  the  reheat  coil  differential  temperature  error.  As 
long  as  the  VAV  box  controller  has  an  airflow  setpoint,  as  well  as  heating  and  cooling 
temperature  setpoints,  VPACC  will  function  independently  of  the  control  strategy  used. 
Common  mechanical  and  control  faults  will  result  in  a deviation  of  one  or  more  of  these  errors 
from  its  value  during  normal  operation,  which  can  be  detected  by  a CUSUM  chart. 

The airflow rate error, $Q_{\mathrm{error}}$, is defined as

$Q_{\mathrm{error}} = Q_{\mathrm{actual}} - Q_{\mathrm{setpoint}}$ (3)

where

$Q_{\mathrm{actual}}$ = measured airflow rate
$Q_{\mathrm{setpoint}}$ = airflow rate set point.

The CUSUMs of this error, $S_Q$ and $T_Q$, are effective for detecting damper faults and differential
pressure sensor faults associated with airflow measurement.

The absolute value of the airflow rate error, $|Q_{\mathrm{error}}|$, is defined as

$|Q_{\mathrm{error}}| = |Q_{\mathrm{actual}} - Q_{\mathrm{setpoint}}|$ (4)

Only one CUSUM value, $S_{|Q|}$, is defined for this error since the error is never negative. $S_{|Q|}$ is
effective for detecting unstable damper control faults.


The temperature error, $T_{\mathrm{error}}$, is defined as

$T_{\mathrm{error}} = T_{\mathrm{zone}} - \mathrm{CSP}$ : if $T_{\mathrm{zone}} > \mathrm{CSP}$ (5a)
$T_{\mathrm{error}} = 0$ : if $\mathrm{HSP} < T_{\mathrm{zone}} < \mathrm{CSP}$ (5b)
$T_{\mathrm{error}} = T_{\mathrm{zone}} - \mathrm{HSP}$ : if $T_{\mathrm{zone}} < \mathrm{HSP}$ (5c)

where

$T_{\mathrm{zone}}$ = zone temperature
CSP = cooling set point
HSP = heating set point.

The CUSUMs of the temperature error, $S_T$ and $T_T$, are effective for detecting damper faults, valve
faults, and temperature sensor faults. The specific definition of temperature error used in this
report is based on the control sequence described above. Various other commonly used control
sequences may require changes to the definitions of heating setpoint, cooling setpoint, and
temperature error.

The reheat coil differential temperature error, $\Delta T_{\mathrm{error}}$, is defined as

$\Delta T_{\mathrm{error}} = T_{\mathrm{discharge}} - T_{\mathrm{entering}}$ : if $u_{hc} = 0$ (6a)
$\Delta T_{\mathrm{error}} = 0$ : if $u_{hc} \neq 0$ (6b)

where

$T_{\mathrm{discharge}}$ = discharge air temperature (the temperature of the air leaving the reheat coil)
$T_{\mathrm{entering}}$ = entering air temperature (the temperature of the air entering the reheat coil)
$u_{hc}$ = control signal to the reheat coil valve.

The positive CUSUM of the reheat coil differential temperature error, $S_{\Delta T}$, is effective for
detecting a leaking reheat coil valve fault. The negative CUSUM, $T_{\Delta T}$, is effective for detecting
temperature sensor faults. The leaking valve fault highlights the advantages of automated FDD.
Without VPACC, the local controller may be capable of masking this fault by increasing the
airflow rate into the space. In this scenario there will be no “too hot” or “too cold” complaints, so
the fault may go unnoticed while a significant energy penalty accrues.

The errors and CUSUMs are only calculated during occupied periods. During unoccupied
periods,  the  errors  are  not  computed,  and  the  CUSUMs  are  reset  to  zero.  The  first  hour  of  the 
occupied  period  is  treated  the  same  as  the  unoccupied  period,  to  allow  steady  state  conditions  to 
develop. 
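
The four error definitions in Eqs. (3) through (6) and the occupancy rule can be collected in one short Python sketch. The function names, argument names, and the `minutes_occupied` convention (None when unoccupied, otherwise minutes since the start of the occupied period) are hypothetical and chosen only for illustration. The exact comparison `u_hc == 0` is also an assumption; a field implementation would probably use a small tolerance.

```python
def temperature_error(t_zone, csp, hsp):
    """Zone temperature error per Eqs. (5a)-(5c)."""
    if t_zone > csp:
        return t_zone - csp  # Eq. (5a): zone warmer than cooling set point
    if t_zone < hsp:
        return t_zone - hsp  # Eq. (5c): zone cooler than heating set point
    return 0.0               # Eq. (5b): zone between the set points


def vpacc_errors(q_actual, q_setpoint, t_zone, csp, hsp,
                 t_discharge, t_entering, u_hc, minutes_occupied):
    """Return the four process errors, or None when they are not computed."""
    # No errors during unoccupied periods or the first hour of occupancy.
    if minutes_occupied is None or minutes_occupied < 60:
        return None
    q_error = q_actual - q_setpoint                          # Eq. (3)
    dt_error = t_discharge - t_entering if u_hc == 0 else 0.0  # Eqs. (6a), (6b)
    return {
        "q_error": q_error,
        "abs_q_error": abs(q_error),                         # Eq. (4)
        "t_error": temperature_error(t_zone, csp, hsp),
        "dt_error": dt_error,
    }
```

A caller would normalize each returned error and feed it to the corresponding CUSUM, and would reset the CUSUMs to zero whenever the function returns None.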

2.2.5 Point Requirements

Most of the points required by VPACC are already available in the local VAV box controller:
room temperature, cooling setpoint, heating setpoint, airflow rate setpoint, actual airflow rate,
and occupancy status. Entering air temperature is typically not available, so supply air
temperature (available over the control network from the AHU controller) could be used. Many
VAV boxes are equipped with a discharge air temperature sensor, which VPACC needs in order
to calculate the reheat coil differential temperature error. If a discharge air temperature sensor is
not available, a simplified version of VPACC could be used, implementing the airflow rate error
and the temperature error only.

2.2.6  Parameters 

For each process error to which CUSUM analysis is to be applied, there is a set of parameters
that must be known and/or specified. These are the expected value of the process error ($\bar{x}$), the
process error standard deviation ($\sigma$), the slack parameter ($k$), and the alarm limits for the S and T
CUSUMs ($h_S$ and $h_T$). For the purposes of this study, the expected value and standard deviation
of the process error were determined by analysis of a short period of fault-free operation from a
particular data source. CUSUM analysis was performed for each error using an expected value
and standard deviation representative of the VAV boxes from each site. These parameters will be
referred to as the VPACC statistical parameters throughout the remainder of this report. The
slack parameter k = 3 and alarm limits $h_S = h_T = 900$ are the same for all data sources. To exceed
the alarm limit value using 1 min data, an error that is five standard deviations from the mean
would have to persist for 7.5 h. When a CUSUM does exceed the alarm limit, it is reset to zero
and the calculations resume. Each CUSUM is also reset to zero during unoccupied periods (and
during the first hour of occupancy, to allow steady state conditions to develop). Thus, the
severity of a fault can be established from the number of alarms over a period of time.
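
The reset-on-alarm behavior and the 7.5 h figure can be illustrated with a small Python sketch. The function name `count_alarms` is hypothetical and the positive sum is assumed to mirror Eq. (2b). With k = 3, a constant normalized error of z = 5 adds 2 to S at every sample, so S reaches 900 after 450 one-minute samples (7.5 h) and strictly exceeds it on the next sample.

```python
def count_alarms(z_values, k=3.0, h=900.0):
    """Count S and T alarms; a sum is reset to zero after it alarms."""
    s, t = 0.0, 0.0
    s_alarms, t_alarms = 0, 0
    for z in z_values:
        s = max(0.0, z - k + s)  # assumed mirror-image form of Eq. (2b)
        t = min(0.0, z + k + t)  # Eq. (2b)
        if s > h:
            s_alarms += 1
            s = 0.0              # reset and resume the calculation
        if t < -h:
            t_alarms += 1
            t = 0.0
    return s_alarms, t_alarms
```

For example, a full day (1440 samples) of a persistent five-standard-deviation error yields three S alarms under these assumptions, which is how the alarm count over a period reflects fault severity.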

2.2.7  Special  Cases 

2.2.7.1 No Discharge Air Temperature Sensor

Many  VAV  boxes  are  equipped  with  a discharge  air  temperature  sensor,  which  VPACC  needs  to 
calculate  the  reheat  coil  differential  temperature  error.  If  a discharge  air  temperature  sensor  is  not 
available,  a simplified  version  of  VPACC  could  be  used,  implementing  the  airflow  rate  error,  the 
absolute  value  of  the  airflow  rate  error,  and  the  temperature  error  only.  In  this  case,  a leaking 
reheat  coil  valve  (or,  in  the  case  of  electric  reheat,  staged  reheat  enabled  “on”  in  the  cooling 
mode)  would  not  be  detected  unless  it  was  so  extreme  that  the  VAV  box  was  unable  to  maintain 
the zone temperature at the set point, thereby causing alarms due to excessive values of $S_T$.

2.2.7.2 Pressure Dependent

In  some  VAV  boxes,  the  damper  is  controlled  directly  in  response  to  zone  temperature  without 
an intermediate determination of an airflow setpoint. $Q_{\mathrm{error}}$ and $|Q_{\mathrm{error}}|$ do not exist for a pressure
dependent VAV box. In this case, a stuck damper may go undetected. In the case where the zone
is overcooled, the reheat coil valve will open (or staged reheat will be enabled “on” if electric
reheat is employed) and compensate for the fault, masking its existence. In the case where the
zone is undercooled, the rising zone temperature may create alarms due to excessive values of $S_T$.

2.2.7.3  No  Reheat 

Some  VAV  boxes  do  not  have  reheat  capabilities.  Others  do  not  have  reheat  available  part  of  the 
year  because  a two  pipe  hydronic  system  is  being  used  for  chilled  water  at  that  time.  Since  the 
VAV  box  cannot  take  any  control  action  to  increase  zone  temperature,  a negative  temperature 
error does not necessarily indicate a fault. In this situation, only the $S_T$ CUSUM will be
calculated for $T_{\mathrm{error}}$.

2.2.7.4 Dual Duct

In a dual duct VAV box, there is no reheat coil (and no electric reheat). Instead there are two air
inlets, namely, a cold deck and a hot deck. Each air inlet has a damper and differential pressure
sensor. For this arrangement, two airflow errors ($Q_{\mathrm{error,hot}}$ and $Q_{\mathrm{error,cold}}$) and two absolute value
airflow errors ($|Q_{\mathrm{error,hot}}|$ and $|Q_{\mathrm{error,cold}}|$) will be calculated. No $\Delta T_{\mathrm{error}}$ will be calculated as there
is no reheat capability.
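
The special cases above amount to selecting which CUSUMs are evaluated for a given box. A minimal Python sketch of that selection follows. The function name, flag names, and CUSUM labels are hypothetical. Two choices are assumptions rather than statements from the text: that no reheat coil differential temperature error is computed when reheat is unavailable, and that a dual duct box still monitors both temperature CUSUMs.

```python
def monitored_cusums(pressure_independent=True, has_reheat=True,
                     has_discharge_sensor=True, dual_duct=False):
    """Return the set of CUSUM labels evaluated for a VAV box configuration."""
    if dual_duct:
        # Assumption: the zone temperature error is still monitored.
        cusums = {"S_T", "T_T"}
        # One set of airflow CUSUMs per inlet; no reheat coil, so no dT error.
        for deck in ("hot", "cold"):
            cusums |= {f"S_Q_{deck}", f"T_Q_{deck}", f"S_absQ_{deck}"}
        return cusums

    cusums = {"S_T"}
    if has_reheat:
        cusums.add("T_T")  # a negative temperature error is only actionable with reheat
    if pressure_independent:
        cusums |= {"S_Q", "T_Q", "S_absQ"}  # requires an airflow setpoint
    if has_reheat and has_discharge_sensor:
        cusums |= {"S_dT", "T_dT"}  # requires a discharge air temperature sensor
    return cusums
```

Called with the defaults, the function returns all seven CUSUMs of the single duct, pressure-independent box with hydronic reheat described in Section 2.2.4.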


3 Data  Sources 

3.1  SITE-1 

SITE-1  is  a large  federal  office  building  in  California.  Air  handling  unit  data  were  collected 
from  three  constant-volume  AHUs  with  enthalpy-based  economizers.  VAV  box  data  were 
collected  from  eight  single-duct,  pressure  independent,  cooling-only  VAV  boxes.  Stand-alone 
software  was  configured  by  facility  personnel  to  automatically  collect  data  from  the  appropriate 
equipment  controllers  at  5 min  intervals,  generate  trend  data  files,  and  e-mail  those  files  to  the 
researchers. 

3.2  SITE-2 

SITE-2  is  a restaurant  with  a pre-existing  arrangement  for  remote  monitoring.  The  monitoring 
firm  agreed  to  provide  the  researchers  with  1 min  data  collected  from  one  constant-volume  AHU 
with  a temperature-based  economizer.  Only  those  points  already  being  trended  were  available. 
Mixed  air  temperature  was  not  one  of  those  points.  Although  the  economizer  is  controlled  to 
maintain  the  supply  air  temperature  at  a setpoint,  the  heating  and  cooling  coil  valves  are 
controlled  to  maintain  zone  temperature,  not  supply  air  temperature,  at  a setpoint.  Therefore,  the 
supply  air  temperature  setpoint  is  not  a meaningful  quantity  in  all  modes  of  operation.  In 
addition,  the  zone  temperature  setpoint,  another  point  not  being  trended,  is  varied  based  on  a 
fixed  schedule.  The  zone  temperature  setpoint  is  also  occupant  adjustable.  A modified  version 
of  APAR  was  developed  in  which  the  modes  are  determined  as  described  earlier,  but  only  a 
subset  of  the  rules,  those  not  involving  mixed  air  temperature  or  supply  air  temperature  setpoint, 
were  evaluated. 

3.3  SITE-3 

SITE-3 is a community college campus. Data were collected from nine single-duct, pressure
independent VAV boxes with hydronic reheat. The reheat coils are supplied with hot water by a
two-pipe system (a single piping system is used for chilled water during cooling season and hot
water during heating season). During the cooling season (the time period during which the data
were collected) hot water for reheat is not available, so the VAV boxes were treated as if they
were  cooling-only.  The  building  control  system  was  configured  by  facility  personnel  to  trend 
data  from  the  appropriate  equipment  controllers  at  one  min  intervals. 

3.4  SITE-4 

SITE-4  is  a university  campus.  It  was  originally  intended  that  the  appropriate  data  would  be 
trended  using  the  university’s  advanced  control  system,  which  would  allow  the  researchers  to 
access  the  equipment  controllers  via  the  Internet.  Due  to  time  and  personnel  constraints,  it  was 
not  possible  to  establish  this  mode  of  data  collection.  However,  three  weeks  of  preliminary  1 
min  data  were  trended  by  facility  personnel  and  made  available  to  the  researchers.  Data  were 
collected  from  two  variable-air-volume  AHUs  with  enthalpy-based  economizers,  as  well  as  from 
eight  pressure  independent  VAV  boxes  with  hydronic  reheat.  These  systems  operate  24  hours 
per  day,  7 days  per  week.  There  is  no  occupied/unoccupied  scheduling. 

4 Results

Examples  of  faulty  operation  are  presented  from  each  of  the  sites. 

4.1  SITE-1 

4.1.1  AHU  Supply  Air  Temperature  Fault 

Figure  4.1  shows  a plot  of  temperature  and  control  signal  data  from  one  of  the  AHUs  at  SITE-1, 
labeled AHU-A. Based on the control signals (the cooling coil valve is shown in black; the heating
coil valve is not shown but remains fully closed; the mixing box damper is not shown but remains
aligned for minimum ventilation), APAR determines that AHU-A is operating in Mode 4 (mechanical
cooling with minimum outdoor air), then applies the rules for this mode. One of the rules for
Mode 4 is Rule 19, which states that if the average cooling coil valve control signal is fully open
(within 1 %) and the difference between supply air temperature and supply air temperature
setpoint is greater than 1.7 °C (3.0 °F), the cooling coil valve is saturated and a persistent supply
air temperature error exists. Figure 4.1 shows that the supply air temperature (blue) varies from
9 °C to 13 °C (50 °F to 55 °F) while the supply air temperature setpoint (red) is fixed at 7 °C (45 °F),
an unreasonably low value for this application. Clearly, the supply air temperature error is
greater than 1.7 °C (3.0 °F); therefore, APAR has detected a fault. Facility personnel confirmed
that  the  fault  was  the  result  of  inappropriate  operator  intervention. 


Figure  4.1  SITE-1  AHU-A  Supply  Air  Temperature  Fault 
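
Rule 19 as stated above can be written as a one-line check. The Python sketch below is illustrative only: the function and argument names are hypothetical, and the temperature difference is assumed to be signed (supply air warmer than its setpoint), which is the case that arises when a cooling coil valve is saturated.

```python
def rule_19(avg_cooling_valve_pct, supply_air_temp_c, supply_air_setpoint_c,
            valve_tol_pct=1.0, temp_threshold_c=1.7):
    """True when the cooling coil valve is saturated and a supply air error persists."""
    valve_fully_open = avg_cooling_valve_pct >= 100.0 - valve_tol_pct
    # Assumption: the rule concerns supply air warmer than its setpoint.
    temp_error = supply_air_temp_c - supply_air_setpoint_c
    return valve_fully_open and temp_error > temp_threshold_c
```

With the values reported for AHU-A (supply air between 9 °C and 13 °C against a 7 °C setpoint, valve fully open), the error is 2 °C to 6 °C and the rule is satisfied throughout.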

4.1.2 AHU Mode Switch Fault


Figure  4.2  shows  a plot  of  the  control  signals,  including  the  cooling  coil  valve  (blue),  heating 
coil  valve  (red),  and  mixed  air  damper  (green),  from  another  of  the  AHUs  at  SITE-1,  labeled 
AHU-B.  Of  interest  is  the  3 h period  from  420  min  until  600  min  after  the  beginning  of  the 
occupied  period.  During  this  3 h time  span,  APAR  observes  AHU  operation  in  three  different 
modes:  Mode  2 (cooling  with  outdoor  air  - heating  coil  valve  less  than  1 % open,  cooling  coil 
valve  less  than  1 % open,  mixed  air  damper  greater  than  26  % open  [25  % is  the  position  for 
minimum  outdoor  air  fraction  to  meet  ventilation  requirements  plus  a 1 % threshold]),  Mode  3 
(mechanical  cooling  with  100  % outdoor  air  - heating  coil  valve  less  than  1 % open,  cooling  coil 
valve  greater  than  1 % open,  mixed  air  damper  greater  than  99  % open),  and  Mode  4 
(mechanical  cooling  with  minimum  outdoor  air  - heating  coil  valve  less  than  1 % open,  cooling 
coil  valve  greater  than  1 % open,  mixed  air  damper  less  than  26  % open).  Rule  28  is  evaluated 
regardless  of  mode,  stating  that  if  more  than  seven  mode  switches  are  recorded  in  1 h,  a fault  has 
been  detected.  In  this  case,  the  AHU  switches  between  modes  26  times  over  the  3 h period, 
satisfying  Rule  28  for  each  of  the  3 h.  The  most  probable  cause  of  this  fault  is  incorrect  tuning 
of  the  temperature  control  PID  loop  in  the  AHU  controller.  A follow  up  with  facility  personnel 
confirmed  the  existence  of  the  fault  as  well  as  the  diagnosis. 


Figure 4.2 SITE-1 AHU-B Mode Switch Fault
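
The mode thresholds quoted above and Rule 28 can be sketched in Python as follows. The function names are hypothetical. Only Modes 2, 3, and 4 are classified, because those are the modes whose thresholds are given in this example; any other combination returns None rather than guessing at modes defined elsewhere in the report. The default minimum damper position of 25 % is the SITE-1 value.

```python
def classify_mode(heating_pct, cooling_pct, damper_pct,
                  min_damper_pct=25.0, tol_pct=1.0):
    """Return 2, 3, or 4 for the modes described in the text, else None."""
    heating_closed = heating_pct < tol_pct
    cooling_closed = cooling_pct < tol_pct
    cooling_open = cooling_pct > tol_pct
    if heating_closed and cooling_closed and damper_pct > min_damper_pct + tol_pct:
        return 2  # cooling with outdoor air
    if heating_closed and cooling_open and damper_pct > 100.0 - tol_pct:
        return 3  # mechanical cooling with 100 % outdoor air
    if heating_closed and cooling_open and damper_pct < min_damper_pct + tol_pct:
        return 4  # mechanical cooling with minimum outdoor air
    return None   # not one of the three modes covered by this sketch


def rule_28(modes_in_hour, max_switches=7):
    """True when more than max_switches mode changes occur in one hour of samples."""
    modes = list(modes_in_hour)
    switches = sum(1 for a, b in zip(modes, modes[1:]) if a != b)
    return switches > max_switches
```

The comparison in `rule_28` is strict, so exactly seven switches in an hour does not trigger the rule.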


4.1.3  AHU  Simultaneous  Heating  and  Cooling  Fault 

Figure  4.3  shows  a plot  of  the  control  signals  from  another  of  the  AHUs  at  SITE-1,  labeled 
AHU-C.  During  a 25  min  time  span  starting  230  min  after  the  beginning  of  the  occupied  period, 
the heating coil valve (red) is open while the mixing box damper (green) is 100 % open. For a
10  min  period  during  this  time  span  (from  235  min  until  245  min)  both  the  heating  coil  valve  and 
the  mixing  box  damper  are  100  % open.  This  combination  of  control  signals  is  inconsistent  with 
any  known  mode,  so  APAR  classifies  this  period  of  operation  as  Mode  5 (unknown  mode)  and 
evaluates  the  rules  associated  with  this  mode.  Rule  23,  one  of  the  rules  associated  with  Mode  5, 
states  that  if  the  average  position  of  the  heating  coil  valve  is  greater  than  1 % and  the  average 
position  of  the  mixing  box  damper  is  more  than  26  % (1  % more  than  the  minimum  for 
ventilation),  then  a fault  has  been  detected,  since  the  AHU  is  simultaneously  heating  and 
cooling/economizing.  The  most  probable  cause  of  this  fault  is  an  AHU  controller  sequencing 
error.  A follow  up  with  facility  personnel  confirmed  this  diagnosis.  The  facility  personnel  then 
modified  the  controller  sequencing  logic.  After  the  modification,  the  fault  was  not  detected 
again. 


Figure  4.3  SITE-1  AHU-C  Simultaneous  Heating  and  Cooling  Fault 
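
Rule 23 as stated above reduces to two threshold comparisons. In the Python sketch below the function and argument names are hypothetical, and the 25 % minimum damper position is the SITE-1 value quoted earlier for AHU-B, assumed here to apply to AHU-C as well.

```python
def rule_23(avg_heating_valve_pct, avg_damper_pct,
            min_damper_pct=25.0, tol_pct=1.0):
    """True when the AHU is heating while the damper is open beyond minimum."""
    heating = avg_heating_valve_pct > tol_pct
    economizing = avg_damper_pct > min_damper_pct + tol_pct
    return heating and economizing
```

For the 10 min period in which both the heating coil valve and the mixing box damper are 100 % open, both comparisons hold and the rule is satisfied.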

4.1.4  AHU  Temperature  Sensor  / Damper  Fault 

Figure  4.4  shows  another  plot  of  temperatures  and  control  signals  from  AHU-C.  From  220  min 
until  700  min  after  the  beginning  of  the  occupied  period  (approximately  an  eight  hour  time  span), 
the  mixed  air  damper  (green)  is  positioned  for  100  % outdoor  air.  The  cooling  coil  valve  (not 
shown)  ranges  from  fully  closed  to  fully  open  and  the  heating  coil  valve  (not  shown)  remains 
fully  closed.  Based  on  this  combination  of  control  signals,  APAR  places  AHU-C  in  Mode  3 
(mechanical  cooling  with  100  % outdoor  air)  and  evaluates  the  rules  for  Mode  3.  One  of  these  is 
Rule 10, which states that if the outdoor air and mixed air temperature differ by more than 1.7 °C
(3.0 °F), a fault has been detected since the outdoor air and mixed air temperature should be the
same when the mixed air damper is positioned for 100 % outdoor air. During this time, the
mixed air temperature (brown) remains approximately 3 °C (5 °F) greater than the outdoor air
temperature (blue), satisfying Rule 10. The most probable causes of this fault include a mixed
air or outside air temperature sensor error, mixed air damper leakage, or an actuator failure. A
follow  up  with  facility  personnel  revealed  that  the  fault  was  caused  by  the  location  of  the  outdoor 
air  temperature  sensor  on  the  roof  of  the  building,  remote  from  the  AHU  outside  air  intake. 
Although  the  location  of  the  sensor  is  less  than  ideal,  it  was  not  possible  for  the  building  staff  to 
relocate  it. 


Figure  4.4  SITE-1  AHU-C  Temperature  Sensor  / Damper  Fault 
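
Rule 10 compares two temperatures that should agree when the damper is positioned for 100 % outdoor air. A minimal Python sketch, with hypothetical function and argument names, is:

```python
def rule_10(outdoor_air_temp_c, mixed_air_temp_c, threshold_c=1.7):
    """True when outdoor and mixed air temperatures disagree in Mode 3."""
    return abs(outdoor_air_temp_c - mixed_air_temp_c) > threshold_c
```

The approximately 3 °C offset observed for AHU-C exceeds the 1.7 °C threshold, so the rule is satisfied for the whole period.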

4.1.5 VAV Box Zone Temperature Fault

Figure  4.5  shows  a plot  of  the  zone  temperature  and  the  related  CUSUM  from  one  of  the  VAV 
boxes  at  SITE-1,  labeled  VAV  Box  - A.  Approximately  500  min  after  the  beginning  of  the 
occupied  period  the  supply  air  temperature  from  the  air  handling  unit  (brown)  begins  to  rise, 
reaching  a peak  of  32  °C  (90  °F)  at  880  min.  The  zone  temperature  (green)  closely  tracks  the 
supply  air  temperature  from  the  AHU,  peaking  at  approximately  the  same  temperature  and  at 
approximately  the  same  time.  In  this  operating  region  the  zone  temperature  error  is  defined  as 
the  difference  between  the  zone  temperature  and  the  cooling  setpoint.  This  zone  temperature 
error  is  normalized,  using  sitewide  statistical  parameters  determined  from  one  month  of  training 
data  from  SITE-1,  by  subtracting  the  average  zone  temperature  error  (0.34  °C  [0.61  °F]),  then 
dividing  by  the  standard  deviation  (1.0  °C  [1.8  °F]).  The  slack  parameter  k is  set  at  three,  so  at 
630  min,  when  the  normalized  zone  temperature  error  increases  to  more  than  three,  the  positive 
temperature CUSUM (STemp, purple) begins to increase. STemp exceeds the alarm limit h
(set at 180, which, with data at 5 min intervals, corresponds to an error five standard deviations
greater than average for 7.5 h) at 1000 min after the beginning of the occupied period. Although
building staff were notified of the detection of this fault, they were unable to investigate further.
Clearly the cause of the problem is the high AHU supply air temperature; however, it is unknown
if  there  was  an  AHU  fault.  It  is  possible  that  the  AHU  supply  air  temperature  setpoint  was  high 
due  to  heating  demand  from  other  VAV  boxes  served  by  the  AHU.  This  may  be  a system  level 
fault,  resulting  from  the  hierarchical  relationship  between  different  pieces  of  equipment  in  the 
HVAC  system. 


Figure  4.5  SITE-1  VAV  Box-A  Zone  Temperature  Fault 
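
The normalization and the alarm limit arithmetic in this example can be reproduced with a short Python sketch. The function names are hypothetical; the numerical values are the SITE-1 statistical parameters quoted in the text, and the positive sum is assumed to grow by z - k per sample.

```python
def normalize(error, mean, std):
    """Normalized process error: z = (error - mean) / std."""
    return (error - mean) / std


def minutes_to_reach_limit(z, k, h, sample_minutes):
    """Minutes of a constant normalized error z needed for S to reach h."""
    if z <= k:
        raise ValueError("S does not grow unless z exceeds k")
    return (h / (z - k)) * sample_minutes


# SITE-1 zone temperature parameters quoted in the text (degrees Celsius).
MEAN_C, STD_C, K, H = 0.34, 1.0, 3.0, 180.0

# Raw zone temperature error at which the normalized error equals k,
# i.e. the point at which the positive CUSUM starts to accumulate.
threshold_error_c = MEAN_C + K * STD_C
```

Here `threshold_error_c` is about 3.34 °C above the cooling setpoint, and `minutes_to_reach_limit(5.0, K, H, 5.0)` gives 450 min, which matches the 7.5 h stated for h = 180 with 5 min data.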


4.2  SITE-2 

4.2.1  AHU  Saturated  Cooling  Coil 

Figure  4.6  shows  a plot  from  an  AHU  at  SITE-2,  labeled  AHU-A.  For  the  first  450  min  of 
occupancy,  the  heating  coil  (red)  remains  closed,  the  mixed  air  damper  (green)  is  aligned  for  the 
minimum  outside  air  fraction,  and  the  cooling  coil  valve  (blue)  is  saturated  at  100  %.  Based  on 
these  control  signals,  APAR  determines  that  AHU-A  is  operating  in  Mode  4 (mechanical  cooling 
with  minimum  outdoor  air).  This  AHU  is  a single  zone  unit  in  which  the  cooling  coil  valve  is 
controlled  to  maintain  zone  temperature  at  the  zone  temperature  setpoint  rather  than  supply  air 
temperature  at  the  supply  air  temperature  setpoint.  One  of  the  rules  associated  with  Mode  4 is 
Rule  20,  which,  as  modified  for  SITE-2,  states  that  if  the  average  cooling  coil  valve  position  is 
greater  than  99  % open,  the  valve  is  saturated  and  any  additional  cooling  load  will  cause  the  zone 
temperature to drift. When this rule is satisfied, APAR declares a warning rather than a fault,
since the supply air temperature error cannot be evaluated (see the description of SITE-2 in
Section 3.2 for further explanation). The zone temperature (brown) is maintained between 20 °C
(68 °F) and 22 °C (72 °F), a reasonable range for the application. There are many possible
diagnoses, including temperature sensor error, cooling coil valve actuator failure, inappropriate
operator/occupant intervention, temperature control PID loop tuning error, or sequencing logic
error. On the other hand, it is entirely possible that this warning was caused by some unusual
activity  in  the  zone  which  generated  a large  cooling  load.  A follow  up  with  facility  personnel 
was  not  possible. 


Figure 4.6 SITE-2 AHU-A Saturated Cooling Coil Fault (horizontal axis: time in minutes from
the beginning of the occupied period)

4.2.2  AHU  Mode  Switch  Fault 

Figure  4.7  shows  a plot  of  the  control  signals  from  AHU-A,  including  the  cooling  coil  valve 
(blue),  heating  coil  valve  (red),  and  mixed  air  damper  (green).  During  the  first  three  hours  of  the 
occupied  period  APAR  observes  AHU-A  cycle  between  two  different  modes:  Mode  2 (cooling 
with  outdoor  air  - heating  coil  valve  less  than  1%  open,  cooling  coil  valve  less  than  1%  open, 
mixed  air  damper  open  more  than  21%  [20%  is  the  minimum  for  ventilation])  and  Mode  3 
(mechanical  cooling  with  100  % outdoor  air  - heating  coil  valve  less  than  1%  open).  Rule  28  is 
evaluated  regardless  of  mode,  stating  that  if  more  than  seven  mode  switches  are  recorded  in  an 
hour,  a fault  has  been  detected.  In  this  case,  the  AHU  switches  between  the  two  modes  72  times 
during  the  three  hour  period,  satisfying  Rule  28  for  each  of  the  three  hours.  The  most  probable 
cause  of  this  fault  is  either  a sequencing  error  or  incorrect  tuning  of  the  temperature  control  PID 
loop  in  the  AHU  controller.  The  existence  of  the  fault  was  confirmed  by  the  monitoring  firm  and 
reported  to  the  facility  personnel,  who  were  unable  to  investigate  further. 


Figure 4.7 SITE-2 AHU-A Mode Switch Fault

4.3  SITE-3 

Results from 20 weeks of field testing for each of the nine VAV boxes are summarized in Table
4.1. The number reported in a cell represents the total number of alarms over the 20 week period
for a particular CUSUM and a particular VAV box. The total number of alarms for each box is
also reported in the last column of Table 4.1. Note that the total excludes $S_Q$ and $T_Q$ because they
are in essence a subset of $S_{|Q|}$. That is, an alarm of $S_Q$ or $T_Q$ will always produce an alarm of $S_{|Q|}$,
although $S_{|Q|}$ may alarm first if, for instance, the airflow errors are predominantly positive but
include some large negative errors as well. Because reheat was not available, VPACC did not
monitor the zone temperature error in the heating mode.

The VPACC results in Table 4.1 indicate that VAV boxes E and I are performing poorly and
VAV boxes F, G, and H are performing well. The other VAV boxes are performing somewhere
between these two extremes. Closer inspection of the trended data for VAV box E reveals that
the zone temperature routinely exceeds the cooling set point of 22.2 °C (72 °F) by 2.2 °C (4 °F) to
3.3 °C (6 °F). This produces 70 alarms of $S_T$ per week on average and nine weeks with 100 to 115
alarms of $S_T$. It is common for the discharge air temperature to this zone to be 15.6 °C (60 °F) to
21.1 °C (70 °F), indicating that the capacity problems are due at least in part to the manner in
which the AHU is being controlled and/or to the design of the system. Another VAV box (C) on
the same AHU routinely operates in the heating mode 1.1 °C (2 °F) to 1.7 °C (3 °F) below the
heating set point of 22.2 °C (72 °F). The lack of reheat makes it impossible for the AHU to
satisfy the two zones when one requires cooling and the other heating. Operational changes to
reduce the minimum airflow rates for the heating mode may help alleviate the problem of
overcooling certain zones.


Table 4.1 VPACC results for the field data sets.

                          Number of Alarms
VAV Box   $S_Q$   $T_Q$   $S_T$   $T_T$   $S_{|Q|}$   Total¹
A            0      29      36       0        29         65
B            0     125      62       0       125        187
C            0      38      42       0        38         80
D            0     104      52       0       105        157
E            0      70    1406       0        72       1478
F            0       1       0       0         1          1
G            0       0       1       0         0          1
H            0      10       3       0        10         13
I            0     674       3       0       674        677

¹ Total = $S_T$ + $T_T$ + $S_{|Q|}$ because $S_Q$ and $T_Q$ are subsets of $S_{|Q|}$.
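
The footnote relation can be checked against every row of Table 4.1 with a few lines of Python. The row values below are transcribed from the table; the assignment of the first column to $S_Q$ is inferred from the footnote and from the six values in each row, and the names `TABLE_4_1`, `total_is_consistent`, and `weekly_average` are hypothetical.

```python
# Rows of Table 4.1 as (S_Q, T_Q, S_T, T_T, S_absQ, Total).
TABLE_4_1 = {
    "A": (0, 29, 36, 0, 29, 65),
    "B": (0, 125, 62, 0, 125, 187),
    "C": (0, 38, 42, 0, 38, 80),
    "D": (0, 104, 52, 0, 105, 157),
    "E": (0, 70, 1406, 0, 72, 1478),
    "F": (0, 1, 0, 0, 1, 1),
    "G": (0, 0, 1, 0, 0, 1),
    "H": (0, 10, 3, 0, 10, 13),
    "I": (0, 674, 3, 0, 674, 677),
}


def total_is_consistent(row):
    """Check the footnote relation Total = S_T + T_T + S_absQ for one row."""
    s_q, t_q, s_t, t_t, s_abs_q, total = row
    return s_t + t_t + s_abs_q == total


def weekly_average(alarms, weeks=20):
    """Average number of alarms per week over the test period."""
    return alarms / weeks
```

Every row satisfies the relation, and the weekly averages for box E ($S_T$, 1406 alarms) and box I ($T_Q$, 674 alarms) come to about 70 and 34 alarms per week, in line with the figures quoted in the text.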


The  VPACC  results  in  Table  4.1  indicate  that  box  I has  severe  airflow  control  problems.  The 
minimum  number  of  alarms  of  Tq  in  a week  is  seven,  while  the  average  is  close  to  34.  This 
indicates  that  the  flow  rate  to  the  zone  is  consistently  less  than  the  set  point  airflow  rate.  This  is 
true  whether  the  VAV  box  operates  in  the  heating  or  cooling  mode.  This  seems  to  indicate  that 
the  static  pressure  is  not  sufficient  to  deliver  the  amount  of  air  needed  in  the  zone.  Due  to  the 
nature of the flow error, $S_{|Q|}$ alarms at the same times as $T_Q$.

VAV  boxes  F and  G alarm  only  one  time  each  over  the  twenty  weeks  of  testing.  Inspection  of 
the  process  data  supported  the  findings  of  VPACC,  namely,  that  the  control  was  quite  good. 
VAV box H also performed well, with 11 of the 13 alarms occurring during the week of June
10-16, 2002. For much of the first part of that week, the AHU supply air temperature ranged from
18.3  °C  (65  °F)  to  21.1  °C  (70  °F),  causing  the  zone  temperature  to  exceed  its  cooling  set  point  of 
22.2  °C  (72  °F)  by  nearly  1.1  °C  (2  °F).  Hence  the  problem  does  not  appear  to  be  with  the  control 
of  the  VAV  box.  The  controller  operated  in  the  heating  mode  during  most  of  the  testing  and  this 
contributed  to  the  low  number  of  alarms  because  only  flow  control  was  monitored  in  the  heating 
mode. 

The performance of the remaining four VAV boxes (A, B, C, and D) is a little more difficult to
assess. A significant number of the alarms for each occur during five weeks of the testing,
namely, May 20-26, 2002, May 27 to June 2, 2002, June 10-16, 2002, July 8-14, 2002, and July
22-28, 2002. Specifically, 55 of 65 alarms for A, 111 of 187 alarms for B, 75 of 80 alarms for C,
and 101 of 156 alarms for D occur during those five weeks.

The  problem  in  the  weeks  of  May  20-26,  2002  and  May  27  to  June  2,  2002  is  associated  with  the 
AHU  temperature  control.  All  four  boxes  (as  well  as  E,  which  is  served  by  the  same  AHU)  have 
a significant number of S_T alarms during these two weeks. Closer inspection of the trended data
indicates the discharge air temperatures to the zones exceeded 23.9 °C (75 °F) for nearly 20 h
over the latter part of the first week and the beginning of the second week. Since this time period
encompasses  the  Memorial  Day  holiday,  it  is  likely  that  there  was  a scheduling  inconsistency 
that  had  the  AHU  and  VAV  boxes  operating  in  an  occupied  mode  while  the  chiller  was  not 
running. 

The  problem  in  the  other  three  weeks  is  a sudden  drop  in  the  airflow  through  the  four  boxes. 
Figure 4.8 shows the temperature error (T_error) and airflow error (Q_error) for box D during a
portion of the week of July 8-14, 2002. The sudden drop in the airflow rate produces large
negative airflow errors and positive temperature errors. As shown in Figure 4.9, this results in
numerous airflow alarms (T_Q) and one temperature alarm (S_T). This particular problem occurs at
six  distinct  times  in  the  three  weeks.  The  problem  occurs  at  the  same  time  in  each  of  the  boxes 
and  also  occurs  in  box  E.  The  fact  that  the  alarms  occur  at  the  same  time  in  each  of  the  boxes 
points  to  the  AHU  as  the  likely  source  of  the  problem. 
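
The reasoning above, that alarms appearing at the same time in every box served by one AHU implicate the AHU rather than the individual boxes, can be sketched in Python. The function name, the 15-minute grouping window, and the sample alarm times are hypothetical; the report does not describe an automated implementation of this check.

```python
from collections import defaultdict


def common_alarm_windows(alarm_minutes_by_box, window_min=15):
    """Return start minutes of windows in which every box alarmed.

    alarm_minutes_by_box maps a box label to the minutes (from the start
    of the data set) at which that box raised an alarm. Alarms are grouped
    into fixed windows of window_min minutes; the fixed-window grouping
    and its width are illustrative assumptions.
    """
    boxes_by_window = defaultdict(set)
    for box, minutes in alarm_minutes_by_box.items():
        for minute in minutes:
            boxes_by_window[minute // window_min].add(box)

    all_boxes = set(alarm_minutes_by_box)
    return sorted(
        window * window_min
        for window, boxes in boxes_by_window.items()
        if boxes == all_boxes
    )


# Made-up alarm times for four boxes served by the same AHU.
alarms = {
    "A": [125, 610],
    "B": [121, 300, 612],
    "C": [128, 614],
    "D": [122, 611],
}
print(common_alarm_windows(alarms))  # → [120, 600]
```

In this made-up data, the alarm at minute 300 appears only in box B and would be attributed to that box, while the windows starting at minutes 120 and 600 are shared by all four boxes and would prompt a look at the AHU.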


Figure 4.8 Field data for VAV box 101W during the week of July 8, 2002 (horizontal axis: data point at 1-min intervals)


Considering the remaining 16 weeks of data, the results from VPACC indicate that boxes A and
C are performing fairly well (averaging about 1 alarm per week), while boxes B and D are not
doing as well. In general, boxes B and D lack capacity in the cooling mode, as indicated by actual
airflow  rates  that  are  considerably  less  than  the  set  point  values.  In  the  case  of  box  B,  the 
resultant  temperature  errors  can  be  significant,  sometimes  exceeding  1.1  °C  (2  °F).  Despite  the 
capacity  problems,  temperature  control  in  zone  D is  not  a significant  problem. 

Figure 4.9 VPACC output corresponding to the conditions in Figure 4.8

4.4  SITE-4 

4.4.1  AHU  Simultaneous  Heating  and  Cooling  Fault 

Figure 4.10 shows a plot of the control signals from one of the AHUs at SITE-4, labeled AHU-A.
During the first 1140 min (19 h) of the day, the heating coil valve (red) varies between 5 %
and  35  % open.  Over  the  same  time  period,  the  mixing  box  damper  (green)  varies  between  25  % 
and  35  % open.  This  AHU  has  two  separate  mixing  box  dampers:  one  allows  the  minimum 
amount  of  outdoor  air  for  ventilation,  while  the  other  is  for  cooling  with  outside  air.  The  mixing 
box  damper  position  shown  in  Figure  4.10  refers  to  the  arrangement  for  cooling  with  outside  air. 
The cooling coil valve is closed throughout the 19 h period. This combination of control signals
is inconsistent with any known mode of operation, so APAR classifies this period of operation as
Mode 5 (unknown mode) and evaluates the rules associated with Mode 5. Rule 23, one of the
rules associated with Mode 5 and modified to reflect the mixing box damper arrangement of
AHU-A, states that if the average position of the heating coil valve is greater than 1 % and the
average position of the mixing box damper is greater than 1 %, then a fault has been detected,
since  the  AHU  is  simultaneously  heating  and  cooling/economizing.  The  most  probable  cause  of 
this  fault  is  either  a sequencing  error  or  a temperature  sensor  error  related  to  the  specific  control 
strategy  implemented  in  this  AHU,  in  which  the  cooling  coil  valve,  heating  coil  valve,  and 
mixing  box  damper  are  each  controlled  by  independent  PID  loops,  along  with  independent 
temperature  sensors  and  setpoints.  Facility  personnel  were  not  able  to  investigate  further. 
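
As a minimal Python sketch of the check described above, Rule 23 (as modified for AHU-A) compares the average heating coil valve position and the average mixing box damper position against a 1 % threshold. The function name, the argument names, and the sample values are hypothetical, and the averaging interval is left to the caller because it is not specified in this section.

```python
from statistics import mean


def rule_23_fault(heating_valve_pct, mixing_damper_pct, threshold_pct=1.0):
    """Return True when both average positions exceed the threshold.

    heating_valve_pct and mixing_damper_pct are sequences of control
    signals in percent open, sampled over one averaging interval.
    A True result means the AHU is simultaneously heating and
    cooling/economizing.
    """
    return (
        mean(heating_valve_pct) > threshold_pct
        and mean(mixing_damper_pct) > threshold_pct
    )


# Made-up samples within the ranges reported for AHU-A.
print(rule_23_fault([5.0, 20.0, 35.0], [25.0, 30.0, 35.0]))  # → True
# Heating coil valve closed: no simultaneous heating and economizing.
print(rule_23_fault([0.0, 0.0, 0.0], [25.0, 30.0, 35.0]))    # → False
```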

Figure 4.10 SITE-4 AHU Simultaneous Heating and Cooling Fault


4.4.2  VAV  Box  Airflow  Fault 

A sufficient quantity of data to generate statistical parameters for the SITE-4 VAV boxes was
not available, so parameters from SITE-1 were used instead. SITE-1 parameters were chosen
because SITE-1 and SITE-4 both have single-duct, pressure-independent VAV boxes. Figure 4.11
shows a plot of zone airflow data and the damper control signal from a VAV box at SITE-4,
labeled VAV Box-A. For this entire day, the airflow setpoint (blue) is 0.27 m³/s (580 cfm).
The actual measured airflow rate (red) is in the region of (0.46 ± 0.02) m³/s [(965 ± 50) cfm]. The
damper control signal (not shown) is 0 %. The positive airflow CUSUM (purple) steadily
increases, exceeding the alarm limit at 1415 min. The alarm limit is set at 900, which, with data
at 1 min intervals, corresponds to an error five standard deviations greater than average
persisting for 7.5 h. A follow-up with facility personnel confirmed the existence of the fault.
Possible diagnoses include: airflow sensor failure, damper/actuator failure, improper airflow
control PID loop tuning, control logic sequencing error, or inappropriate operator intervention.
A study to determine robust sets of statistical parameters for a variety of systems is needed;
however, this example illustrates the possibility of using the CUSUM approach without
collecting training data from each potential site.
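
The relationship between the alarm limit and the 7.5 h figure can be illustrated with a one-sided CUSUM in Python. This is a sketch under stated assumptions, not the VPACC implementation: it uses the standard form S = max(0, S + z - k), where z is the error normalized by the mean and standard deviation of normal operation. The slack value k = 3 is inferred from the arithmetic in the text (a five-standard-deviation error adds 5 - 3 = 2 per sample, so 450 one-minute samples give 900). The function and parameter names are hypothetical.

```python
def positive_cusum_alarm(errors, mean, std, slack=3.0, alarm_limit=900.0):
    """Return the 1-based sample count at which the sum reaches the limit.

    errors: process errors sampled at a fixed interval (1 min in the text).
    mean, std: statistics of the error during normal operation.
    slack: assumed slack parameter k, in standard deviations.
    Returns None if the alarm limit is never reached.
    """
    cusum = 0.0
    for count, error in enumerate(errors, start=1):
        z = (error - mean) / std              # normalized error
        cusum = max(0.0, cusum + z - slack)   # one-sided (positive) CUSUM
        if cusum >= alarm_limit:
            return count
    return None


# A constant error five standard deviations above the mean, 1 min samples.
samples = [5.0] * 600
print(positive_cusum_alarm(samples, mean=0.0, std=1.0))  # → 450
```

With one sample per minute, 450 samples correspond to 7.5 h, matching the interpretation of the alarm limit given above. Errors smaller than the slack value never accumulate, which is what keeps normal fluctuations from raising alarms.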

Figure 4.11 SITE-4 VAV Box Airflow Fault


5 Summary  and  Future  Work 

This report presents the results of a field study to evaluate APAR, a rule-based FDD tool for
AHUs, and VPACC, an FDD tool for VAV boxes based on statistical quality control. APAR consists
of  a set  of  expert  rules,  derived  from  mass  and  energy  balances.  Control  signals  are  used  to 
determine  the  AHU’s  mode  of  operation,  which  identifies  the  subset  of  the  rules  to  be  evaluated. 
VPACC  uses  a small  set  of  process  errors,  valid  for  most  VAV  box  control  strategies,  to  measure 
VAV  box  performance.  CUSUM  charts,  a statistical  quality  control  tool,  are  used  to  evaluate  the 
process  errors.  Thresholds  are  determined  by  statistical  analysis  of  a database  of  “normal 
operation”  data. 

APAR  and  VPACC  were  evaluated  using  data  from  several  different  sources  - an  office 
building,  a restaurant,  and  community  college  and  university  campuses,  featuring  constant-  and 
variable-air-volume  systems.  Any  evaluation  using  field  data  must  contend  with  some  inherent 
difficulties: reliance on sensor data to discern the true state of the system, the inability to report
a “false negative” (an undetected fault), and ambiguity regarding what constitutes a fault.
However, in this case consistent results across diverse testing environments give a high level of
confidence that the FDD tools will perform in an even greater variety of applications. Several
faults  have  been  successfully  detected  and  confirmed  by  building  operations  staff.  Every  site  has 
been  found  to  have  at  least  one  fault.  Even  though  the  sample  size  is  small,  these  results  appear 
to  confirm  the  hypothesis  that  faults  of  the  type  that  can  be  detected  by  these  tools  are  common. 

In a parallel study beyond the scope of this report, the FDD tools were embedded in AHU and
VAV box controllers and evaluated in a laboratory setting. Follow-on work will require
partnering with control system manufacturers to conduct field tests of APAR and VPACC,
embedded in their own controller products.

NIST’s  vision  of  full  commercialization  of  automated  fault  detection  and  diagnostics  is  one  in 
which APAR and VPACC, along with appropriate parameters and thresholds, are packaged
within  HVAC  control  products.  In  order  for  this  vision  to  become  reality,  more  work  is  needed 
in  three  main  areas.  First,  it  is  impractical  to  expect  trend  data  to  be  evaluated  to  determine  the 
necessary  parameters  and  thresholds  for  each  site,  as  was  done  in  this  study.  Ideally,  sets  of 
robust  parameters  and  thresholds  that  are  effective  across  specified  ranges  of  applications  would 
be  available.  Additional  field  data  from  a wide  variety  of  systems  must  be  collected  in  order  to 
determine these robust parameters and thresholds. Second, the current embedded FDD tools are
written using generic mathematical functions available in the languages in which the controllers
are programmed. Although this approach is suitable for a technology demonstration, built-in
APAR and VPACC functions would greatly simplify the task of embedding FDD in a control
program. Finally, more work is needed to develop alternative ways to interpret FDD results and
deliver this information to the building operator. The most direct approach is to generate an
alarm that the operator must acknowledge whenever a rule is satisfied (APAR) or a cumulative
sum  exceeds  the  alarm  limit  (VPACC).  Refinements  to  the  basic  scheme  are  possible.  For 
example,  rather  than  automatically  sending  the  alarm  to  the  operator,  the  building  control  system 
could  highlight,  on  demand,  those  devices  having  experienced  the  greatest  number  of  alarms  in  a 
given  period  of  time.  Or,  if  an  automated  maintenance  management  system  is  used,  an  alarm 
could  automatically  generate  an  appropriate  work  order.  However,  many  faults  are  the  result  of 
design  or  commissioning  issues  that  are  beyond  the  scope  of  the  building  maintenance  staff. 
Furthermore,  a fault  in  another  piece  of  equipment,  such  as  an  air  handling  unit,  boiler,  or  chiller, 
could  result  in  this  approach  generating  a large  number  of  alarms,  perhaps  overwhelming  the 
operator.  A mechanism  is  needed  to  resolve  multiple  conflicting  fault  reports  before  reporting 
them  to  the  operator. 
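
One of the refinements mentioned above, highlighting on demand the devices with the greatest number of alarms in a given period, could look like the following Python sketch. The alarm-log format (device label, minute of occurrence), the device labels, and the function name are hypothetical; no such interface is defined in this report.

```python
from collections import Counter


def top_alarming_devices(alarm_log, start, end, n=3):
    """Rank devices by alarm count within the period [start, end).

    alarm_log is an iterable of (device, minute) pairs. Returns up to n
    (device, count) pairs, highest count first.
    """
    counts = Counter(
        device for device, minute in alarm_log if start <= minute < end
    )
    return counts.most_common(n)


# Made-up alarm log for illustration.
log = [
    ("VAV-1", 5), ("VAV-2", 7), ("VAV-1", 9), ("AHU-1", 12),
    ("VAV-2", 15), ("VAV-1", 20), ("VAV-2", 45),
]
print(top_alarming_devices(log, start=0, end=30, n=2))
# → [('VAV-1', 3), ('VAV-2', 2)]
```

A ranking of this kind only reduces the volume of alarms presented to the operator; it does not address the separate need, noted above, to resolve conflicting fault reports that stem from a fault in another piece of equipment.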